[Feature] Add LLM Token Rate Limit #602
Conversation
Pull Request Overview
This PR adds LLM (Large Language Model) Token Rate Limiting functionality to sentinel-golang, implementing two rate limiting strategies: Fixed Window and PETA (Predictive Estimated Token Allowance). The feature enables token-based rate limiting for LLM API calls with support for multiple token counting strategies and Redis-based distributed rate limiting.
Key Changes:
- Implements Fixed Window and PETA token rate limiting strategies with a Redis backend (see the reserve-then-settle sketch after this list)
- Adds token encoding support (currently OpenAI) for estimating token usage
- Provides adapters for Eino framework integration
- Includes comprehensive example implementation with Gin middleware
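The PETA strategy's exact semantics are defined in this PR's `core/llm_token_ratelimit` package and are not reproduced here. As a rough, in-memory sketch of the reserve-then-settle idea behind a predictive token allowance (all names below, such as `petaWindow`, `Reserve`, and `Settle`, are hypothetical illustrations, not the PR's API; the real implementation is Redis-backed and distributed):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// petaWindow is a hypothetical, single-process illustration of the
// reserve-then-settle idea: reserve an *estimated* token count before
// the LLM call, then settle against the *actual* usage reported by
// the provider. The PR's implementation does this atomically in Redis.
type petaWindow struct {
	mu       sync.Mutex
	limit    int64 // token budget per window
	used     int64 // tokens reserved or consumed in the current window
	resetAt  time.Time
	interval time.Duration
}

func (w *petaWindow) rollover() {
	if time.Now().After(w.resetAt) {
		w.used = 0
		w.resetAt = time.Now().Add(w.interval)
	}
}

// Reserve rejects the request if the estimated tokens would exceed the budget.
func (w *petaWindow) Reserve(estimated int64) bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.rollover()
	if w.used+estimated > w.limit {
		return false // predicted usage would break the window budget
	}
	w.used += estimated
	return true
}

// Settle corrects the window once the provider reports real usage:
// over-estimates are refunded, under-estimates are charged.
func (w *petaWindow) Settle(estimated, actual int64) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.used += actual - estimated
	if w.used < 0 {
		w.used = 0
	}
}

func main() {
	w := &petaWindow{limit: 1000, interval: time.Minute, resetAt: time.Now().Add(time.Minute)}
	est := int64(120) // e.g. a tiktoken-based estimate of the prompt
	if !w.Reserve(est) {
		fmt.Println("rate limited")
		return
	}
	// ... perform the LLM call, read actual token usage from the response ...
	w.Settle(est, 150)
	fmt.Println("window usage:", w.used)
}
```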
Reviewed Changes
Copilot reviewed 69 out of 102 changed files in this pull request and generated 3 comments.
| File | Description | 
|---|---|
| go.mod | Updates dependencies to support token rate limiting (Redis, tiktoken, copier, etc.) | 
| pkg/adapters/eino/wrapper.go | Implements LLM wrapper for Eino framework with token rate limiting | 
| pkg/adapters/eino/options.go | Defines configuration options for Eino adapter | 
| core/llm_token_ratelimit/*.go | Core implementation of token rate limiting logic, rule management, and strategies | 
| core/llm_token_ratelimit/script/*.lua | Redis Lua scripts for atomic rate limiting operations (see the sketch after this table) | 
| example/llm_token_ratelimit/* | Complete example with Gin server and LLM client integration | 
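The atomic window accounting lives in the Lua scripts under `core/llm_token_ratelimit/script/`, which are not reproduced here. As a minimal sketch of how such a script can be executed atomically from Go, assuming the `github.com/redis/go-redis/v9` client (the key name and script body below are illustrative, not the PR's):

```go
package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

// Illustrative fixed-window script: consume ARGV[2] tokens from a
// per-window budget of ARGV[1], expiring the counter key after
// ARGV[3] seconds. Returns 1 if the tokens fit, 0 otherwise.
const fixedWindowScript = `
local used  = tonumber(redis.call('GET', KEYS[1]) or '0')
local limit = tonumber(ARGV[1])
local cost  = tonumber(ARGV[2])
if used + cost > limit then
  return 0
end
redis.call('INCRBY', KEYS[1], cost)
if used == 0 then
  redis.call('EXPIRE', KEYS[1], ARGV[3])
end
return 1
`

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	script := redis.NewScript(fixedWindowScript)

	// Hypothetical key layout: one counter per resource per window.
	allowed, err := script.Run(ctx, rdb,
		[]string{"llm:tokens:gpt-4o:window"},
		10000, // window token limit
		250,   // tokens this request wants to consume
		60,    // window length in seconds
	).Int()
	if err != nil {
		panic(err)
	}
	fmt.Println("allowed:", allowed == 1)
}
```

Keeping the read-check-increment sequence inside one script is what makes the window safe under concurrent callers across multiple instances; separate GET/INCRBY round trips would race.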
Overall, the feature functionality is solid with no critical issues (as previously discussed). The rule loading mechanism still needs full alignment with Sentinel's existing datasource architecture. The proposed adjustments can be prioritized in future releases based on your schedule, but could you confirm your plan to proceed or defer them? This implementation is acceptable for the current preview/experimental release phase.
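For background on the datasource point: sentinel-golang rule modules are normally wired to external configuration through `ext/datasource` property handlers. A minimal sketch of what that alignment could look like is below; `Rule` and `LoadRules` are hypothetical stand-ins for this PR's rule API, while `PropertyConverter`, `PropertyUpdater`, and `NewDefaultPropertyHandler` are the existing `ext/datasource` extension points.

```go
package main

import (
	"encoding/json"

	"github.com/alibaba/sentinel-golang/ext/datasource"
)

// Rule and LoadRules are hypothetical stand-ins for the PR's
// core/llm_token_ratelimit rule-management API; only the datasource
// wiring below uses sentinel-golang's existing extension points.
type Rule struct {
	Resource   string `json:"resource"`
	Strategy   string `json:"strategy"` // e.g. "fixed-window" or "peta"
	TokenLimit int64  `json:"tokenLimit"`
}

func LoadRules(rules []*Rule) (bool, error) { return true, nil }

// tokenRuleParser converts raw datasource bytes into rule objects.
func tokenRuleParser(src []byte) (interface{}, error) {
	var rules []*Rule
	if err := json.Unmarshal(src, &rules); err != nil {
		return nil, err
	}
	return rules, nil
}

// tokenRuleUpdater pushes parsed rules into the rule manager, so the
// module reacts to config changes the same way flow rules do.
func tokenRuleUpdater(data interface{}) error {
	rules, _ := data.([]*Rule)
	_, err := LoadRules(rules)
	return err
}

func main() {
	handler := datasource.NewDefaultPropertyHandler(tokenRuleParser, tokenRuleUpdater)
	_ = handler // register with a file/etcd/nacos datasource as usual
}
```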
Merged commit 8cd4cfa into alibaba:ospp/llm-token-rate-limit. Refers to: #596
Performance Test Report